Abstract

Microevolution and origins of Bradyrhizobium populations associated with soy- 
beans at two field sites (A and B, 280 km apart in Canada) with contrasting 
histories of inoculation were investigated using probabilistic analyses of six core
(housekeeping) gene sequences. These analyses supported division of 220 isolates
into five lineages corresponding either to B. japonicum groups 1 and 1a or
to one of three novel lineages within the genus Bradyrhizobium. None of the 
isolates from site A and about 20% from site B (the only site with a recent 
inoculation history) were attributed to inoculation sources. The data suggest 
that most isolates were of indigenous origin based on sequence analysis of 148 
isolates of soybean-nodulating bacteria from native legumes (Amphicarpaea
bracteata and Desmodium canadense). Isolates from D. canadense clustered with 
B. japonicum group 1, whereas those from A. bracteata were placed in two 
novel lineages encountered at soybean field sites. One of these novel lineages 
predominated at soybean sites and exhibited a significant clonal expansion 
likely reflecting selection by the plant host. Homologous recombination events 
detected in the 35 sequence types from soybean sites had an effect on genetic 
diversification that was approximately equal to mutation. Interlineage transfer 
of core genes was infrequent and mostly attributable to gyrB that had a history 
of frequent recombination. Symbiotic gene sequences (nodC and nifH) of
isolates from soybean sites and native legumes clustered in two lineages corresponding
to B. japonicum and B. elkanii with the inheritance of these genes appearing
predominantly by vertical transmission. The data suggest that soybean-nodulating
bacteria associated with native legumes represent a novel source of ecologically 
adapted bacteria for soybean inoculation. 



Introduction 

The genus Bradyrhizobium includes species of economi- 
cally important soil bacteria that fix atmospheric nitrogen 
in symbiotic association with soybeans (Glycine max) and
thereby minimize the requirement for nitrogen fertilizer 
inputs in crop production. 

In some Bradyrhizobium lineages, the accessory genes 
encoding symbiotic functions (nodulation and nitrogen 
fixation) reside in a chromosomally located symbiosis 
island region that has potential for lateral transfer (Kane- 
ko et al. 2002, 2011). The evolutionary histories of genes 
affecting symbiotic functions in the bradyrhizobia may 
differ from those of core (housekeeping) genes not 
directly involved in symbiosis (Stepkowski et al. 2005,
2007; Steenkamp et al. 2008). Therefore, microevolutionary
and population genetics studies of bradyrhizobia
require separate analysis of symbiotic and core genes. 

Soybean is the most important grain legume in the 
world on a production basis and is a major source of oil 
and protein. This legume was domesticated in China 
about 4000 years ago with subsequent cultivation in 
secondary centers of domestication including India, Thailand, 
Japan, Korea, and Indonesia (Smartt and Hymowitz 
1985). Soybeans were first introduced into the Americas 
in the late 18th century, but commercial production did 
not start in the United States and Canada until the 1920s 
(http://www.soyinfocenter.com/bibliographies.php). Most
modern soybeans originate from a narrow genetic base
and consequently have limited genetic variability (Delannay
et al. 1983). Therefore, it is not surprising that only four
lineages in the genus Bradyrhizobium (B. japonicum,
B. elkanii, B. liaoningense, and B. yuanmingense) have been
found to nodulate soybeans under field conditions and all 
have been encountered in the primary and/or secondary 
centers of soybean domestication (e.g., Appunu et al. 
2008; Vinuesa et al. 2008; Li et al. 2011). As soybean is 
exotic to North America, soils without a cropping history 
of this legume are considered to contain few symbiotic 
bacteria capable of nodulating soybeans (Weaver et al. 
1972; Semu et al. 1979). Consequently, soybeans introduced 
into new environments are inoculated with effective 
nitrogen-fixing strains of Bradyrhizobium. Early work in 
the United States employing nonmolecular methods such 
as serotyping suggests that introduced strains often occupy
a high proportion of root nodules in the first season, but 
are progressively replaced in subsequent seasons by heter- 
ogeneous symbiotic bacteria resident in soil (reviewed by 
Streeter 1994). As a result of enrichment by the host 
plant, populations of bradyrhizobia have become well 
established in soils of soybean-growing regions in the 
United States (Weaver et al. 1972). 

Diverse native legumes that are known to associate with
populations of bradyrhizobia are distributed throughout the
soybean-growing regions of North America
(Sterner and Parker 1999; Parker and Kennedy 2006).
For example, Amphicarpaea bracteata (Hog Peanut) is a 
close relative of soybean (Doyle and Doyle 1993; Zhu 
et al. 1995) and plant infection tests have indicated that 
Bradyrhizobium isolates from this host readily elicit root 
nodules on soybeans (Marr et al. 1997). Despite these 
observations, the origins of soybean-nodulating bacteria 
that frequently outcompete introduced strains in soybean 
crop ecosystems have yet to be investigated. 

Previously, we made a collection of symbiotic bacteria 
(220 isolates) that were isolated from soybean cultivars 
inoculated with soil from two field sites (A and B, about 
280 km apart in eastern Canada) with contrasting histories 
of soybean cultivation and Bradyrhizobium inoculation. 
Simultaneously, we assembled a reference collection of 
Bradyrhizobium strains known to have been used in inoc- 
ulants for soybeans in Canada. During the course of this 
study, we observed consistent nodulation of soybeans fol- 
lowing inoculation with root-zone soils collected from 
native legumes (A. bracteata and Desmodium canadense) 
growing in natural woodland habitats. A further collec- 
tion (148 isolates) of these soybean-nodulating bacteria 
was made for comparative analysis. 

Our objectives were to infer the microevolutionary his- 
tories of bradyrhizobia at soybean field sites A and B as 
well as the extent to which they originated from inocula- 
tion sources or from populations of symbiotic bacteria 
associated with legumes native to eastern Canada. As 
recombination is a major evolutionary force that influences
the structure of bacterial populations (Fraser et al.
2010), we investigated the effect of homologous recombi- 
nation on genetic diversification of bacteria from sites A 
and B. 

Genetic characterization of bradyrhizobia was carried out by
multilocus sequence typing (MLST) (Maiden 2006) of six
core (atpD, glnII, recA, gyrB, rpoB, and dnaK) and two
symbiotic (nodC and nifH) genes. Microevolutionary his- 
tories and recombination events were inferred using 
model-based Bayesian approaches that take into account 
the effect of recombination (Pritchard et al. 2000; Didelot 
and Falush 2007) as well as conventional maximum-likeli- 
hood (ML) methods. 

Materials and Methods 

Site description, soil sampling, and bacterial 
isolation 

(a) Soybeans 

Two field sites (A and B), about 280 km apart, were 
selected for soil sampling based on contrasting histories 
of soybean cultivation and inoculation. Site A consists of 
an experimental plot on the Central Experimental Farm 
(CEF), Ottawa, Ontario (latitude 45° 23' 09.04" N; longitude
75° 43' 10.99" W). Soybeans have been cultivated at the
CEF since 1897, but at site A, they were first grown and
inoculated (either with soil from a soybean field or bacte- 
rial culture) in 1939; from about 1970, soybeans were 
grown without inoculation. The soil is fine sandy loam 
(Melanic Brunisol), pH 6.7 (water). 

Site B is a farmer's field at St Hugues, Quebec (latitude 
45° 50' 25.84" N; longitude 72° 52' 03.63" W). The soil is 
poorly drained clay loam (Orthic Humic Gleysol), pH 7.0 
(water). This site had no history of soybean cultivation 
until 1992 when soybeans were introduced and inoculated 
(Nitragin Co., Milwaukee, WI); soybeans were subse- 
quently grown and inoculated in each of 5 years with 
corn (Zea mays) cultivated in intervening years.

At the time of soil sampling (July 1999), soybean cultivar 
AC Maple Glen was grown at both sites; all plants examined 
were well nodulated. Thirty soil samples (15-cm depth) 
were collected with aseptic precautions from the vicinity of 
soybean roots, but otherwise at random, from a 25-m² area
at each site. Soil samples were pooled to form a composite 
for each site and maintained at 4°C before use. 

Isolation of bradyrhizobia from soil samples A and B
was done (within 7 days of soil sampling) using soybean 
cultivars AC Maple Glen and AC Orford (subsequently 
abbreviated M and O, respectively) as trap plants; 
both cultivars are short season soybeans used in eastern 
Canada. A soil suspension (fivefold dilution in water) of 
each soil sample was mixed for 20 min and 10-mL
aliquots used to inoculate soybean seedlings from
surface-sterilized seed planted in Leonard jars (Vincent 1970)
containing vermiculite and supplied with nitrogen-free 
nutrient solution (Bromfield et al. 1995); controls 
consisted of uninoculated plants. Plants were maintained 
in a controlled environment chamber for 35 days at 25°C 
(16 h day) and 17°C (8 h night). Bradyrhizobia were isolated
from surface-sterilized nodules (Bromfield et al. 1995) 
taken at random from roots of plants in 10 replicate jars 
(two plants per replicate) for each soil and cultivar 
combination; uninoculated plants were without nodules. 

Bacteria were grown at 28°C using yeast-extract-mannitol
(YEM) agar (Vincent 1970), modified to contain
(g L⁻¹): 1.5 yeast extract (Oxoid, Basingstoke, Hampshire,
UK) and 1.0 mannitol; 0.15 cycloheximide (Sigma-Aldrich,
Oakville, Ontario, Canada) was added to inhibit fungal
growth for bacterial isolation. Bacteria were purified by
streaking and single colony picking and maintained at
-80°C in 20% (w/v) glycerol for subsequent analysis.

The number of viable bradyrhizobia in soil samples A 
and B was estimated by most probable number (MPN) 
method (Vincent 1970) using soybean cultivars M and 
O grown in pouches (Mega International, West St. Paul, 
Minnesota) and fivefold soil dilutions. The test was car- 
ried out in 1999 within 14 days of soil sampling. 

(b) Native legumes 

Thirty root-zone soil samples were collected with aseptic 
precautions (10-cm depth) from each of the native 
legumes, D. canadense (tribe Desmodieae) and A. bracteata 
(tribe Phaseoleae) growing in natural woodland habitats 
in Quebec: Aylmer (45°22'48.21"N 75°48'5.52"W) and 
Donnacona (46°41'04.83"N 71°44'28.05"W), respectively. 
These native legumes were chosen because they are fairly 
common and are distributed throughout the soybean 
growing regions of eastern Canada. Suspensions of 
composite soil samples representing each legume were 
prepared and aliquots inoculated onto seedlings of 
soybean cultivars O and AC Glengarry grown in Leonard 
jars. All procedures, including plant growth and bacterial
isolation, were carried out as described in the preceding 
section. AC Glengarry replaced AC Maple Glen for use in 
eastern Canada in 2000 (Cober et al. 2007).

Bradyrhizobia 

Totals of 220 (from soybean field sites A and B) and 148 
(from D. canadense and A. bracteata) bacterial isolates 
were analyzed. All bacterial isolates from soybeans (sites 
A and B) and native legumes were slow growing on YEM 
agar (colony diameter <1 mm after 7-21 days at 28°C). 



Reference bradyrhizobia included eight strains known 
to have been used in Canadian commercial soybean inoculants,
10 named species (type strains) and the photosynthetic
bacterium, Bradyrhizobium sp. BTAi1 (Table 1;
Tables S2-S5). 

Nucleotide sequencing 

Partial sequences of six chromosomally encoded core genes
(atpD, glnII, recA, gyrB, rpoB, and dnaK) were generated
for the 220 bacterial isolates from sites A and B as well
as for reference strains not available in public databases.
Partial recA gene sequences were generated for all 148 
bacterial isolates from native legumes together with partial 
dnaK sequences for selected isolates. The six core genes 
used in this study were selected based on their relatively 
uniform distribution on the chromosomes of B. japonicum 
USDA6ᵀ (Kaneko et al. 2011) and USDA110 (Kaneko
et al. 2002) (Table S1) as well as their previous use in phy-
logenetic studies of Bradyrhizobium sp. (Vinuesa et al. 
2008; Menna et al. 2009; Rivas et al. 2009). 

To assess phylogenetic relationships based on genes 
located on the symbiosis island region of the bacterial 
chromosome, partial sequences of the nodC gene (encod- 
ing nodulation protein C involved in nodulation factor 
synthesis) were generated for all 220 isolates from field sites 
A and B as well as for selected isolates from native legumes; 
partial sequences of the nifH gene (encoding metallopro- 
tein II of the nitrogenase enzyme responsible for biological 
nitrogen fixation) were generated for selected isolates. 

Preparation of genomic DNA, amplification, nucleotide
sequencing, and sequence editing were as described by
Bromfield et al. (2010), except that amplifications were performed
using a TProfessional thermocycler (Biometra,
Goettingen, Germany) with 10-µL reaction mixtures containing
1-1.5 ng DNA, 0.1 mmol/L each dNTP,
0.08 µmol/L each primer, 0.5× Titanium Taq DNA polymerase
(Clontech Laboratories Inc., Mountain View, California),
and 1× buffer that was supplied with the enzyme.

Primers for amplification and sequencing were derived 
from the literature and, together with temperature and 
polymerase chain reaction (PCR) cycling conditions, are
shown in Table S6. GenBank accession numbers of the 
1779 nucleotide sequences generated in this study are 
listed in Tables S2-S5. 

Analysis of sequence data 

Sequences were read in frame and aligned using ClustalW 
and RevTrans version 1.4 (Wernersson and Pedersen 
2003) taking into account corresponding amino acid 
alignments. Editing of alignments was done based on 
protein-encoding genes using Mega5 (Tamura et al. 2011);
alignments were trimmed, so that sequences of
each gene were the same length.

2012 The Authors. Published by Blackwell Publishing Ltd

Microevolution of Bradyrhizobium populations J. Tang et al.

Table 1. Bradyrhizobium strains used in soybean inoculants.

| Strain and alternative designation | Year introduced | Origin/Characteristics/Reference | ST | Core lineage/nodC group |
|---|---|---|---|---|
| 61A101 | 1974, discontinued 1987* | Nitragin Co., Milwaukee, WI; isolated Illinois; serogroup C3 | 32 | I/nod II |
| 61A124 | 1974, discontinued 1986* | Nitragin Co., Milwaukee, WI; isolated New Zealand | NA | NA |
| 532C (61A152, SEMIA 5039) | 1990† | Isolated Brazil from soybeans inoculated with strains from the United States of America (Santos et al. 1999) | 24 | V/nod I |
| USDA138 (61A118, SEMIA 5028) | 1974* | USDA/ARS, Beltsville, MD; isolated Mississippi, 1961; serogroup 6 (Keyser and Griffin 1987) | 24 | V/nod I |
| USDA136 (CB1809, RCR3407, TAL379, 61A136, SEMIA 0586) | 1980† | USDA/ARS, Beltsville, MD; reisolate of USDA122; serogroup 122 (Keyser and Griffin 1987) | 4 | IV/nod I |
| USDA122 | See USDA136 | USDA/ARS, Beltsville, MD; isolated Mississippi, 1960. Parent strain of USDA136; serogroup 122 (Keyser and Griffin 1987) | 4 | IV/nod I |
| USDA110 (3I1b110, TAL102) | 1980‡ | USDA/ARS, Beltsville, MD; isolated Florida, 1959; serogroup 110 (Keyser and Griffin 1987) | 9 | IV/nod I |
| USDA142 (61A148, 3I1b142, SEMIA 5058) | 1974* | Isolated India, 1973; serogroup 122 (Keyser and Griffin 1987) | 3 | IV/nod I |

Data are for inoculant strains used in Canada up to 1999 (the time of soil sampling and bacterial isolation); no information available for proprietary
strains. NA, not recovered from field sites A or B. Bradyrhizobium japonicum USDA6ᵀ has the same multilocus genotype as strain 532C. Strain
532C was supplied by T. Wacek, Becker Underwood/Urbana, strains 61A101 and 61A124 by the Nitragin Co, and the remaining strains by P. van
Berkum, USDA/ARS.

Information source: *S. Smith, Nitragin Co., Milwaukee, WI.
†T. Wacek, Becker Underwood/Urbana, St. Joseph, MO.
‡D. Blair, Central Food Inspection Agency, Ottawa, Canada.

Core gene sequences that differed from each other by 
one or more polymorphisms were identified using the 
unique.seqs command implemented in Mothur 1.2 (Schloss 
et al. 2009). Sequences were concatenated using BioEdit 7 
(http://www.mbio.ncsu.edu/bioedit/bioedit.html) and each 
unique allelic profile assigned a sequence type (ST) 
number. 
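
The ST assignment described above can be sketched as follows. This is a minimal illustration only, not the Mothur/BioEdit workflow used in the study: the function name, the input layout (one tuple of per-locus sequences per isolate), and the numbering order are all assumptions, so the ST numbers it produces are arbitrary and do not correspond to those reported here.

```python
def assign_sequence_types(profiles):
    """Assign allele numbers per locus and an ST per unique allelic profile.

    profiles: dict mapping isolate name -> tuple of trimmed, aligned
    sequences, one per locus and in the same locus order for every isolate
    (hypothetical input layout).
    Returns: dict mapping isolate name -> (allelic_profile, st_number).
    """
    n_loci = len(next(iter(profiles.values())))
    allele_ids = [dict() for _ in range(n_loci)]  # per-locus: sequence -> allele number
    st_ids = {}                                   # allelic profile -> ST number
    result = {}
    for isolate, seqs in profiles.items():
        # Sequences differing by one or more polymorphisms get distinct allele numbers.
        allelic_profile = tuple(
            allele_ids[i].setdefault(seq, len(allele_ids[i]) + 1)
            for i, seq in enumerate(seqs)
        )
        # Each unique combination of allele numbers is one sequence type.
        st = st_ids.setdefault(allelic_profile, len(st_ids) + 1)
        result[isolate] = (allelic_profile, st)
    return result


demo = {
    "iso1": ("ACGT", "GGCC"),
    "iso2": ("ACGT", "GGCC"),  # identical to iso1 at both loci
    "iso3": ("ACGA", "GGCC"),  # differs from iso1 at the first locus
}
demo_sts = assign_sequence_types(demo)
```

In this toy input, iso1 and iso2 share an ST, whereas iso3 receives a new one because a single polymorphism at one locus is enough to define a new allelic profile.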

Population structure and ancestry 

Population structure and ancestry of the 220 bacterial iso- 
lates from field sites A and B was inferred using the 
admixture model with independent allele frequencies 
(Pritchard et al. 2000) implemented in STRUCTURE 
version 2.3. Sequence data (six core genes) were formatted 
using xmfa2struct
(http://www.xavierdidelot.xtreemhost.com/clonalframe.htm). STRUCTURE uses a Bayesian
clustering framework and assumes that the observed data 
are derived from K ancestral populations (lineages). 
The admixture model allows for the possibility that 
individuals may have mixed ancestry in more than one of 
the K populations. 



Five replicate Markov Chain Monte Carlo (MCMC) 
runs were performed for each value of K ranging from 2 
to 7 using 100,000 burn-in and 200,000 sampling iterations. 
The ad hoc approach described in the software documen- 
tation was followed to select a value for K as well as the 
additional criterion that an ancestral population must 
contribute >50% genetic material to at least one individ- 
ual to be recognized. Software CLUMPP (Jakobsson and 
Rosenberg 2010) was used to account for label switching 
in five replicate STRUCTURE runs of the selected K. 
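
The additional >50% criterion can be illustrated with a short sketch. It assumes the ancestry estimates are available as a plain list of rows (one row of K proportions per individual); the function names and this input layout are hypothetical and are not part of STRUCTURE or CLUMPP.

```python
def recognized_populations(q_matrix, threshold=0.5):
    """Return indices of ancestral populations that contribute more than
    `threshold` of the genetic material to at least one individual."""
    k = len(q_matrix[0])
    return [j for j in range(k) if any(row[j] > threshold for row in q_matrix)]


def assign_lineage(row, threshold=0.5):
    """Return the index of the population contributing more than `threshold`
    of an individual's ancestry, or None if no population does."""
    best = max(range(len(row)), key=lambda j: row[j])
    return best if row[best] > threshold else None


# Toy ancestry proportions for three individuals and K = 3 (illustrative values only).
q_demo = [
    [0.90, 0.10, 0.00],
    [0.20, 0.75, 0.05],
    [0.45, 0.45, 0.10],  # admixed: no population exceeds 50%
]
kept = recognized_populations(q_demo)
```

Here the third population would not be recognized because it never contributes more than half of any individual's ancestry.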

Phylogenetic relationships between 
bradyrhizobia and reference strains 

ML phylogenetic analyses were carried out using PhyML 
version 3.0 (Guindon et al. 2010) and the substitution 
model GTR+G+I, selected on the basis of the Akaike
information criterion implemented in jMODELTEST ver- 
sion 0.1 (Posada 2008). Settings used in PhyML were as 
follows: five random starting trees, estimated gamma 
shape parameter (four substitution rate categories), esti- 
mated proportion of invariable sites, and SPR and NNI 
tree improvement algorithms. ML trees were reconstructed
using unique STs for concatenated core gene
sequences and unique sequences for symbiotic genes;
1000 nonparametric bootstrap replications were used to
assess support. Trees were drawn using Mega5 software.

Data for 1000 replicate bootstrap trees from the ML 
analysis of six concatenated core gene sequences were 
imported into SplitsTree version 4.1 (Huson and Bryant 
2006) and used to construct a consensus network graph. 

A randomization test was carried out to assess the 
extent of congruence between ML trees reconstructed for 
each of the six core genes employing custom R scripts (R 
Development Core Team 2009) based on the phangorn 
phylogenetic package (Schliep 2011). This test assesses 
whether ML trees for different genes are more similar to 
each other than to trees of random topology (Feil and 
Spratt 2001). One hundred random trees were generated, 
and, following the method of Feil and Spratt (2001), 11 
unique STs were selected to represent the different 
STRUCTURE lineages from a tree inferred by Clonal- 
Frame (see below). An example of R code used in these 
calculations is given in Table S7. 

ClonalFrame analysis of recombination 

To further investigate microevolutionary relationships and 
to infer recombination events, 50% majority rule consensus 
trees were computed for sequence data of the six core genes 
using ClonalFrame version 1.2 (Didelot and Falush 2007). 

ClonalFrame employs a Bayesian framework to infer clo- 
nal relationships while taking into account recombination. 

Five independent MCMC runs, with and without cor- 
rection for recombination, were performed, using sequence 
data for 220 bacterial isolates from field sites A and 
B. Each run consisted of one million iterations. The first 
500,000 iterations of each run were discarded and model
parameters sampled every 100 generations thereafter, producing
a sample size of 5000 from the posterior. Convergence
of the MCMC was judged satisfactory based on the
Gelman-Rubin test and genealogy comparison tool implemented
in the Graphical User Interface.
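
The posterior sample size and the convergence check can be sketched as below. The first function reproduces the arithmetic stated above; the second is a textbook form of the Gelman-Rubin potential scale reduction factor for one scalar parameter, given here as an assumption because the exact implementation in the ClonalFrame interface is not described in this paper. Function names are illustrative.

```python
from statistics import mean, variance


def retained_samples(total_iterations, burn_in, thin):
    """Number of posterior samples kept after burn-in and thinning."""
    return (total_iterations - burn_in) // thin


def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains: list of equal-length lists of sampled values, one per MCMC run.
    Values close to 1 are conventionally read as evidence of convergence.
    """
    n = len(chains[0])
    within = mean(variance(chain) for chain in chains)            # W
    between_over_n = variance([mean(chain) for chain in chains])  # B / n
    pooled = (n - 1) / n * within + between_over_n
    return (pooled / within) ** 0.5


n_kept = retained_samples(1_000_000, 500_000, 100)
r_mixed = gelman_rubin([[1, 2, 3, 4], [1, 2, 3, 4]])       # chains agree
r_apart = gelman_rubin([[0, 1, 0, 1], [10, 11, 10, 11]])   # chains disagree
```

With one million iterations, a burn-in of 500,000, and sampling every 100 iterations, `n_kept` equals the 5000 samples reported for each run.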

Recombination events were recognized when the pos- 
terior probability of an import was above 95%. Measures 
of recombination rate computed were as follows: ρ/θ, the
ratio of recombination and mutation rates; and r/m, the 
ratio of probabilities that a given nucleotide will be 
altered through recombination and mutation. 

Population genetics analysis 

DnaSP version 5.1 (Librado and Rozas 2009) was used to 
calculate summary statistics for sequences representing the 
220 bacterial isolates from field sites A and B. Statistics 
calculated included G+C content, the number of polymorphic 
(segregating) sites (S), and the haplotype (gene) diversity 



(Hd). The average number of pairwise nucleotide differ- 
ences per site (n), number of synonymous substitutions per 
synonymous site (Ttg), number of nonsynonymous substi- 
tutions per nonsynonymous site (n-^), and the ratio of 
nonsynonymous to synonymous substitutions (dN/dS) 
were calculated with Jukes-Cantor correction. Tajima's D 
test of neutrality was calculated based on segregating sites. 
The following statistics of genetic differentiation and 
gene flow were calculated: average number of nucleotide 
substitutions per site between populations (Dxy), the 
sequence-based statistic of genetic differentiation (KST*)
(Hudson et al. 1992a) with permutation tests (10,000 
replications) to assess statistical significance, the effective 
number of migrants (Nm) and the fixation index (Fst) 
described by Hudson et al. (1992b). The codon-based 
Z-test of purifying selection implemented in Mega5 was
carried out using the Pamilo-Bianchi-Li method, pairwise 
deletion, and 500 bootstrap replications. 
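
Three of the summary statistics named above can be sketched in a few lines. This is an illustration under stated assumptions, not a reimplementation of DnaSP: it assumes gap-free aligned sequences of equal length, applies the Jukes-Cantor correction to each pairwise distance before averaging, and uses the usual n/(n - 1) sample-size correction for haplotype diversity. All function names are hypothetical.

```python
from itertools import combinations
from math import log


def segregating_sites(seqs):
    """S: number of alignment columns with more than one nucleotide state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def haplotype_diversity(seqs):
    """Hd: probability that two sequences drawn at random are different."""
    n = len(seqs)
    freqs = [seqs.count(h) / n for h in set(seqs)]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    return -0.75 * log(1 - 4 * p / 3)


def nucleotide_diversity(seqs):
    """pi: mean Jukes-Cantor corrected pairwise difference per site."""
    length = len(seqs[0])
    distances = [
        jukes_cantor(sum(a != b for a, b in zip(s1, s2)) / length)
        for s1, s2 in combinations(seqs, 2)
    ]
    return sum(distances) / len(distances)


# Toy alignment of four sequences (illustrative data only).
toy = ["AAAA", "AAAT", "AAAT", "AATT"]
s_toy = segregating_sites(toy)
hd_toy = haplotype_diversity(toy)
pi_toy = nucleotide_diversity(toy)
```

The correction only ever increases a distance, so the corrected π for the toy alignment is larger than the uncorrected mean proportion of pairwise differences (0.25).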

Relative effectiveness 

Relative nitrogen-fixing effectiveness (RE) of bacterial iso- 
lates was assessed using soybeans grown in Leonard jars 
(Vincent 1970) according to methods and conditions 
described by Bromfield et al. (2010). After 35 days, shoots 
were removed and dried to constant weight at 80°C. 

RE of bacterial isolates was calculated as ([x − x⁰]/
[x⁺ − x⁰]) × 100, where x, x⁰, and x⁺ are the mean shoot
dry weights of, respectively, plants inoculated with a given 
bacterial isolate, uninoculated plants, and plants inocu- 
lated with an effective reference strain. RE values were 
derived from means of five replicates (two plants/repli- 
cate) for each bacterial isolate. 
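
The RE calculation can be written as a small function. This is a sketch of the formula as described in the text; the function and argument names are illustrative, and the shoot dry weights in the example are made-up values rather than data from this study.

```python
from statistics import mean


def relative_effectiveness(isolate_weights, uninoculated_weights, reference_weights):
    """Relative nitrogen-fixing effectiveness (%) from replicate shoot dry weights.

    Each argument is a list of replicate means (same units throughout) for
    plants inoculated with the test isolate, uninoculated plants, and plants
    inoculated with the effective reference strain, respectively.
    """
    x = mean(isolate_weights)
    x_uninoculated = mean(uninoculated_weights)
    x_reference = mean(reference_weights)
    return (x - x_uninoculated) / (x_reference - x_uninoculated) * 100


# Hypothetical weights: the isolate closes half of the gap between the
# uninoculated control and the reference strain.
re_demo = relative_effectiveness([2.0, 2.0], [1.0, 1.0], [3.0, 3.0])
```

On this scale an isolate that performs no better than the uninoculated control scores 0, one that matches the reference strain scores 100, and values above 100 indicate an isolate that outperforms the reference.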

Results 

Description of Bradyrhizobium populations 
at field sites A and B 

Details of the origin (field site, soybean cultivar) of the 
220 bacterial isolates are listed in Table S2. The MPNs of 
viable symbiotic bacteria were averaged over two soybean 
cultivars used as trap hosts. MPN values (xlO^ g^^ soil) 
were 8.5 (site A) and 30 (site B) indicating large census 
population sizes of soybean-nodulating bacteria at both 
field sites at the time of soil sampling. 

The six concatenated core gene sequences (220 isolates) 
were classified into 35 unique STs. Isolates representing 
totals of 15 and 27 STs were recovered from sites A and 
B with seven STs common to both field sites (Table 2). 
At site A, isolates of ST15 and ST21 were predominant
and accounted for ~62% of the isolates at this site. At site B,
isolates of six STs accounted for ~61% of the isolates.
Table 2. Frequency and relative effectiveness (RE) of isolates representing
35 STs from field sites A and B.

Columns: Lineage; ST; nodC group; No. isolates from Site A (soybean cultivars
M and O) and Site B (soybean cultivars M and O); Frequency; RE (Isolate No.).

[Table body not recoverable from the source text.]

Bacterial isolates were from soybean cultivars Maple Glen (M) and AC
Orford (O). nodC groups were inferred by ML analysis. STs shown in
bold share the same multilocus genotype as inoculant strains.



Bradyrhizobium strains known to have been used in 
soybean inoculants up to 1999 (the time of soil sampling/ 
bacterial isolation) are shown in Table 1. Strain USDA136 
had the same ST as strain USDA122 (ST4); 532C and 
USDA138 also had an identical ST to B. japonicum 
USDA6ᵀ (ST24). These inoculant strains (classified as ST4
and ST24) are subsequently referred to as USDA136 and 
532C, respectively. 

A minority of isolates recovered from both field sites 
exhibited the same STs as inoculant strains. At site A,
only one isolate (ST4) had the same genotype as an inoculant
strain (USDA136), but it is unlikely that this isolate
originated from inoculation because soybeans were grown
without inoculation at site A when USDA136 was
brought into service in 1980 (Table 1). At site B, ~31%
of isolates had identical STs to inoculant strains. These were
one isolate of ST24 (532C), two isolates of ST9
(USDA110), 21 isolates of ST3 (USDA142), and 10 isolates
of ST32 (61A101). As inoculant strain 61A101
(ST32) was discontinued in 1987, several years before soy- 
beans were cultivated at site B, it is highly unlikely that 
isolates of ST32 originated from inoculation. For similar 
reasons, it was not surprising that isolates with the same 
ST as inoculant strain 61A124 (years of service, 1974- 
1986) were not encountered at either field site (Table 1). 
Adjusting for years of service of inoculant strains and 
inoculation history, isolates of putative inoculant strains 
were not recovered from site A, whereas about 20% were 
recovered from site B. 

Summary statistics 

Summary statistics for nucleotide sequence data (220 iso- 
lates, field sites A and B) are shown in Table 3. There 
were seven allelic types (h) for the symbiotic nodC gene 
and between 9 (dnaK) and 17 (rpoB) for core genes. The
lower G+C content of the nodC gene (57%) relative to
the average of the six core genes (65%) is consistent with 
the hypothesis of ancient symbiosis island integration into 
the Bradyrhizobium chromosome following acquisition 
from an external source (Kaneko et al. 2011). 

Values of dN/dS were <1 indicating that all loci were 
subjected to purifying selection. This conclusion is supported
by significant (P = 0.000) values of the Z-test statistic
rejecting the null hypothesis of dN = dS. Strong
purifying selection is consistent with essential functions 
of core genes and with the functions of the nodC gene 
generating signal molecules essential for initiation of 
plant-bacterial symbiosis. Tajima's D values were not 
significantly different from 0 for all loci suggesting no 
significant departure from a standard neutral model with 
purifying selection (Feil 2010). 
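
As a quick arithmetic check, the dN/dS ratios follow directly from the per-site diversities at nonsynonymous and synonymous sites reported in Table 3. The sketch below uses the Table 3 values for three loci; the variable and function names are illustrative.

```python
# (pi_N, pi_S) per-site diversities copied from Table 3 for three loci.
TABLE3 = {
    "atpD": (0.0095, 0.0778),
    "recA": (0.0025, 0.1404),
    "nodC": (0.0058, 0.0763),
}


def dn_ds(pi_n, pi_s):
    """Ratio of nonsynonymous to synonymous diversity; values below 1 are
    the pattern expected under purifying selection."""
    return pi_n / pi_s


ratios = {locus: round(dn_ds(pi_n, pi_s), 3) for locus, (pi_n, pi_s) in TABLE3.items()}
```

The rounded ratios match the dN/dS column of Table 3, and all lie well below 1.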

Population structure and ancestry 

Different STRUCTURE models were explored with K 
(number of ancestral lineages) ranging from 2 to 7. Data 
for multiple STRUCTURE runs using the admixture 
model with independent allele frequencies (220 bacterial 
isolates, sites A and B) indicated that K = 5 was optimal 
based on criteria described in Methods; at values of K 
between 2 and 5, each additional K contributed up
to 100% genetic material to multiple isolates, whereas at
values of K between 6 and 7, each new K contributed
<10% genetic material only to a single isolate of ST24
(532C). Although the marginal likelihood plateaued at
K = 5 (Table S8), this may be coincidental given known
problems with the harmonic mean estimator (Lartillot
and Philippe 2006).

Table 3. Summary statistics for core and symbiotic (nodC) partial gene sequences of 220 bacterial isolates from field sites A and B.

| Locus | Sequence length (bp) | GC content (%) | h | S | π | πS | πN | dN/dS | Z | D |
|---|---|---|---|---|---|---|---|---|---|---|
| Core gene | | | | | | | | | | |
| atpD | 435 | 65.3 | 10 | 43 | 0.0258 | 0.0778 | 0.0095 | 0.122 | 4.18 | 1.502 |
| glnII | 555 | 63.5 | 16 | 91 | 0.0288 | 0.0997 | 0.0097 | 0.097 | 5.27 | -0.008 |
| recA | 462 | 67.3 | 11 | 73 | 0.0353 | 0.1404 | 0.0025 | 0.018 | 6.60 | 0.836 |
| gyrB | 618 | 64.0 | 10 | 88 | 0.0343 | 0.1199 | 0.0097 | 0.081 | 6.19 | 1.182 |
| rpoB | 771 | 63.9 | 17 | 111 | 0.0300 | 0.1033 | 0.0084 | 0.082 | 7.11 | 0.609 |
| dnaK | 369 | 66.0 | 9 | 50 | 0.0309 | 0.1127 | 0.0092 | 0.082 | 4.03 | 0.898 |
| Concatenated | 3210 | 64.8 | 35 | 456 | 0.0308 | 0.1077 | 0.0083 | 0.077 | 12.39 | 0.791 |
| Symbiotic gene | | | | | | | | | | |
| nodC | 726 | 57.3 | 7 | 118 | 0.0188 | 0.0763 | 0.0058 | 0.076 | 7.32 | -1.191 |

h, number of haplotypes (alleles); S, number of polymorphic (segregating) sites; nucleotide diversity estimated for all sites (π), synonymous sites
(πS), and nonsynonymous sites (πN) with Jukes-Cantor correction; dN/dS, ratio of nonsynonymous to synonymous substitutions; Z, codon-based
test of purifying selection (500 bootstrap replications); all values are significant (P = 0.000); D, Tajima's D based on segregating sites; all values
are not significantly different from 0 (P > 0.1).

Data for ancestry and admixture levels of the 220 iso- 
lates are shown in Figure 1. Isolates with >50% genetic 
material from one of the five ancestral lineages are con- 
sidered to be representative of that lineage. Among the 
sample of isolates, those assigned to lineage II were most 
abundant (120 isolates), whereas those in lineage I (12 
isolates) were the least. Isolates of lineage I and III were 
each encountered at one of two sites and had 100% of 
their genetic material derived from the respective ances- 
tral lineage. The majority of isolates in lineages II, IV, 
and V (encountered at both field sites) exhibited ances- 
tries that were homogeneous. Of 172 isolates representing 
these three lineages, only 20 isolates (12 STs) exhibited 
mixed ancestries suggesting that limited interlineage flow 
of core genes had taken place. The isolate representing 
ST24 (532C) in lineage V was unusual in that it was 
highly admixed, possessing genetic material inherited 
from four ancestral lineages. All 20 isolates with mixed 
core gene ancestries were from soybean cultivar O. This 
appears consistent with data indicating higher levels of 
haplotype, nucleotide, and gene diversity (core and nodC 
gene sequences) for isolates from soybean cultivar O relative
to cultivar M (Table S9).
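The assignment rule described above (an isolate represents a lineage when more than 50% of its genetic material derives from that lineage) can be sketched as follows. The ancestry vectors in the test are hypothetical and are not taken from the STRUCTURE output of this study; the function name and the tolerance used to flag mixed ancestry are assumptions made for illustration.

```python
LINEAGES = ("I", "II", "III", "IV", "V")


def assign_lineage(q, threshold=0.5, tol=1e-9):
    """Classify one isolate from its ancestry proportions.

    q         -- five proportions (one per ancestral lineage) summing to about 1
    threshold -- minimum share needed to represent a lineage (0.5 in the text)
    Returns (lineage or None, admixed flag).
    """
    best = max(range(len(q)), key=lambda i: q[i])
    lineage = LINEAGES[best] if q[best] > threshold else None
    # An isolate counts as admixed here if any material comes from a second lineage.
    admixed = q[best] < 1.0 - tol
    return lineage, admixed


# Hypothetical ancestry vectors, for illustration only.
print(assign_lineage((0.0, 0.0, 0.0, 0.0, 1.0)))    # homogeneous ancestry
print(assign_lineage((0.08, 0.92, 0.0, 0.0, 0.0)))  # mixed, still assigned
print(assign_lineage((0.3, 0.3, 0.2, 0.2, 0.0)))    # no lineage above 50%
```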

Phylogenetic relationships between 
bradyrhizobia from field sites A and B and 
reference strains 

The consensus network graph based on 1000 replicate 
bootstrap trees from ML analysis of concatenated core 



gene sequences representing isolates from sites A and B 
and reference strains is shown in Figure 2; the corresponding
ML tree is shown in Figure S1.

Using a relatively low value of 0.1 for the consensus 
network threshold, moderate reticulation was evident, 
particularly on inner edges. Reticulation reflects phyloge- 
netic uncertainty and may indicate recombination. 
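The role of the consensus network threshold can be illustrated with a toy example. The sketch assumes that a threshold of 0.1 retains every split found in more than 10% of the bootstrap trees (whether the software uses a strict or inclusive inequality is not stated in the source), and that reticulation appears when two retained splits are incompatible. The five taxa and ten trees are invented for illustration.

```python
from collections import Counter


def consensus_splits(trees, threshold=0.1):
    """Return {split: frequency} for splits found in more than `threshold` of trees."""
    n_trees = len(trees)
    counts = Counter(split for tree in trees for split in tree)
    return {s: c / n_trees for s, c in counts.items() if c / n_trees > threshold}


def compatible(split_a, split_b, taxa):
    """Two splits fit on one tree if at least one of the four intersections is empty."""
    a, b = set(split_a), set(split_b)
    rest_a, rest_b = taxa - a, taxa - b
    return any(not part for part in (a & b, a & rest_b, rest_a & b, rest_a & rest_b))


# Toy data: each tree is the set of its non-trivial splits (one side of each split).
TAXA = frozenset("abcde")
AB, BC, DE = frozenset("ab"), frozenset("bc"), frozenset("de")
TREES = [{AB, DE}] * 7 + [{BC, DE}] * 3

kept = consensus_splits(TREES, threshold=0.1)    # low threshold keeps the minority split
strict = consensus_splits(TREES, threshold=0.5)  # majority-rule style threshold

print(sorted("".join(sorted(s)) for s in kept))
print(compatible(AB, BC, TAXA))
```

With the low threshold, the conflicting splits ab and bc are both retained and cannot be drawn on a single tree, which is what produces a box-like (reticulate) structure in a network graph.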

The network graph and corresponding ML tree show 
well-defined clusters of isolates corresponding to the five 
lineages inferred by STRUCTURE. 

Graph edges within clusters corresponding to lineages 
I, II, and III show little reticulation. These clusters have 
no type strain representatives indicating that they are 
novel evolutionary lineages within the genus Bradyrhizobi- 
um. 

Lineage I was part of a clade with B. elkanii and
B. pachyrhizi. The closest relative of lineage II isolates was 
lineage V, whereas the nearest neighbors of lineage III iso- 
lates were B. liaoningense and B. yuanmingense. 

Little reticulation was also evident in the cluster of
isolates representing lineage IV that included inoculant
strains USDA110, USDA136, and USDA142. In contrast,
the lineage V cluster containing B. japonicum USDA6ᵀ
showed moderate reticulation. Lineages V and IV
correspond to "B. japonicum" groups 1 (USDA6ᵀ) and 1a
(USDA110) defined on the basis of DNA homology
(Hollis et al. 1981). Inoculant strain 61A124 was not recovered
from field sites A or B, but was placed in a distinct
lineage.

Randomization tests were used to assess the extent of 
congruence between ML trees reconstructed for each of 
the six core genes. The 11 STs (2, 4, 9, 11, 12, 15, 16, 25, 
30, 31, and 35) used in the analysis were selected to rep- 
resent the different lineages in a ClonalFrame tree 
(Fig. 3a). Relative to 100 randomly generated trees, only 
the gyrB gene tree was significantly incongruent with trees 



2012 The Authors. Published by Blackwell Publishing Ltd 






Microevolution of Bradyrhizobium populations 



J. Tang et al. 







Figure 1. Core gene ancestries of 220 Bradyrhizobium isolates from field sites A and B inferred by STRUCTURE. Proportions of ancestry from
lineage I (red), II (green), III (blue), IV (yellow), and V (magenta) were inferred assuming K = 5 ancestral populations. Each isolate is represented by
one vertical line color coded according to the proportion of single nucleotide alleles that each isolate derived from one of the ancestries.
Figure 2. Consensus network graph of atpD-glnII-recA-gyrB-rpoB-dnaK concatenated gene sequences (3210 bp) for Bradyrhizobium reference
strains and 35 unique STs representing 220 bacterial isolates from field sites A and B. Consensus network based on 1000 replicate bootstrap trees
from ML analysis. Values on graph edges indicate confidence. Letters A, B, and AB following ST numbers designate site of origin. STs connected
by underscores have the same ST as indicated reference strains; inoculant strains are shown in bold. Gray shaded clusters and roman numbers
designate isolates in lineages inferred by STRUCTURE. Lineages V and IV correspond to B. japonicum groups 1 and 1a, respectively; lineages I, II,
and III are novel. Superscript T designates type strains; scale bar = 1% substitutions per site.
Figure 3. Majority rule consensus trees
inferred by ClonalFrame showing the effect of 
recombination on branch length and branching 
order. Trees are with (a) and without (b) 
correction for recombination and share the 
same scale in coalescent units. Data are for six 
core genes representing 220 bacterial isolates 
from sites A and B. Roman numbers designate 
STRUCTURE lineages I to V. The placement of 
ST24 is indicated. 




of the other five genes irrespective of whether reference 
strains were included in the analysis (Fig. S2). Moreover, 
an ML tree (reconstructed for all unique gyrB sequences;
Fig. S4) was markedly incongruent with the ML "species"
tree of concatenated gene sequences (Fig. S1), suggesting
that extensive intragenic recombination had occurred at 
the gyrB locus: a minimum of 10 lateral transfer events 
were inferred for STs in lineages I, II, IV, and V. 

ClonalFrame analysis of microevolution and 
recombination 

Microevolution and the role of recombination in genetic 
diversification were further investigated by ClonalFrame 
analysis of all 220 isolates from sites A and B. 

To investigate the effect of recombination on branch 
length and branching order, trees were compared with 
and without correction for recombination (Fig. 3). Time 
(path length in coalescent units) to the most recent com- 
mon ancestor (TMRCA) of all isolates was calculated for 
each tree. Based on five independent runs, TMRCA values 
were 1.33 with 95% credibility interval (CI) (1.00-1.70) 
for the recombination corrected tree (Fig. 3a) and 3.84 
with CI (3.47-4.23) for the tree without correction for 
recombination (Fig. 3b) indicating that the TMRCA of all 
lineages was almost three times shorter when allowance 
was made for recombination. Moreover, the branching 
order of the two trees differed markedly. For example,
Figure 3a shows that the isolate of ST24 is placed within
lineage V, but in Figure 3b, it is not. Similarly, lineages 
III and IV are monophyletic in Figure 3a, whereas in 
Figure 3b, they are not. 



The ratio of recombination events relative to mutation
(ρ/θ) estimated for the 220 isolates (five replicate runs)
was 0.05 with CI (0.02-0.08) indicating that recombination
was considerably less frequent than mutation. The
effect of recombination relative to mutation (r/m) was
0.59 with CI (0.34-0.88) indicating that recombination
introduced almost two times fewer substitutions than
mutation. Because clonal expansion of specific bacterial
genotypes, as observed for lineage II in this study (Fig. 4),
may result in underestimation of recombination rates
(Smith et al. 2000), the ρ/θ and r/m statistics were
inferred in a separate analysis using unique STs only.
Consistent with expectations, recombination rates based
on STs only (ρ/θ = 0.09 with CI [0.04-0.14] and
r/m = 1.00 with CI [0.59-1.53]) were higher than those
inferred for all isolates.
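The two statistics can be related by simple arithmetic. The sketch below uses only the point estimates reported above and assumes that each mutation event changes a single nucleotide, so that dividing r/m by the ratio of recombination to mutation events gives a rough average number of substitutions introduced per recombination event. Credibility intervals are ignored, and the function names are illustrative rather than part of ClonalFrame.

```python
# Point estimates reported in the text (credibility intervals omitted).
ESTIMATES = {
    "all 220 isolates": {"rho_over_theta": 0.05, "r_over_m": 0.59},
    "35 unique STs": {"rho_over_theta": 0.09, "r_over_m": 1.00},
}


def mutations_per_recombination(rho_over_theta):
    """How many mutation events occur, on average, per recombination event."""
    return 1.0 / rho_over_theta


def substitutions_per_import(rho_over_theta, r_over_m):
    """Rough mean number of substitutions introduced by one recombination event.

    Assumes one substitution per mutation event.
    """
    return r_over_m / rho_over_theta


for label, est in ESTIMATES.items():
    freq = mutations_per_recombination(est["rho_over_theta"])
    size = substitutions_per_import(est["rho_over_theta"], est["r_over_m"])
    print(f"{label}: {freq:.1f} mutations per recombination event, "
          f"about {size:.1f} substitutions per import")
```

For the unique STs, the first quantity is about 11, so recombination events are roughly 11 times less frequent than mutation events even though their overall effect on substitutions is comparable (r/m close to 1).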

The tree in Figure 3a was analyzed to assess the distri- 
bution of recombination events (probability above 95%) 
on branches of sub-trees each representing one of the five 
lineages. Results are shown in Figure 4 and events found 
at each locus are listed in Table 4. A total of 21 events 
were found in the five sub-trees (Fig. 4). One recombina- 
tion event was found in lineage I, three in lineage II, two 
in lineage III, five in lineage IV, and 10 in lineage V. 
These levels of recombination are consistent with the 
extent of intralineage reticulation in the network graph 
(Fig. 2) and with the average levels of nucleotide diversity 
(concatenated core genes) of lineages II (0.0018), IV 
(0.0043), and V (0.0074) (Table 5). Twelve recombination 
events were found at the gyrB locus, whereas only between 
one and three were found at each of the remaining loci. 
These data are in agreement with the phi test (Huson and 
Figure 4. Distribution of recombination events on branches of sub-trees representing the five STRUCTURE lineages in Figure 3a. Lineages are
labelled I-V. Font sizes of ST numbers on leaves are approximately proportional to the number of isolates represented. Asterisks denote STs with
the same multilocus genotype as reference inoculant strains (see Table 1). STs on internal nodes are inferred ancestors and are italicized; x 
indicates that the ST was not present in the sample. Red numbers indicate the number of recombination events with posterior probability above 
95% on branches highlighted in red. Confidence values (%) for internal nodes are: green (90-100), blue (70-90), and black (less than 70). Scale 
bars represent coalescent units. 



Table 4. Number of recombination events inferred by ClonalFrame
analysis of Figure 4.

            Core gene
Lineage     atpD   glnII   recA   gyrB   rpoB   dnaK   Total
I           0      0       0      1      0      0      1
II          0      0       0      3      0      0      3
III         0      1       0      0      1      0      2
IV          0      0       2      3      0      0      5
V           1      1       1      5      1      1      10
Total       1      2       3      12     2      1      21

Lineages I-V inferred by STRUCTURE. Only events with a posterior
probability of 95% or greater are recognized.

Bryant 2006) that detected significant recombination
(P = 0.002) at the gyrB locus, with results of the congruence
test (Fig. S2), and with higher levels of nucleotide
diversity (lineages II, IV, and V) at the gyrB locus than at
other loci (Table 5). The isolate of ST24 (lineage V), which
was highly admixed, exhibited five events at the atpD, glnII,
recA, rpoB, and dnaK loci on the external branch from
ST24 to the MRCA, providing further evidence of
recombination in the history of ST24 (Fig. 4).

The topology of the sub-tree representing lineage II 
(Fig. 4) suggests that a recent clonal expansion of STs 15 
and 21 had taken place. To further investigate this issue, 
an external/internal branch length ratio test was 
performed on the recombination corrected tree generated 
from a ClonalFrame analysis (5 replicate runs) of the 
lineage II data (n = 120). As the lineage II dataset was 
smaller than the combined dataset, the mean length of an 
import was fixed at the value computed for the whole 
dataset. The results (Fig. S3) indicate that the tree of line- 
age II was unexpectedly star-shaped with a ratio that was 
significantly higher than expected based on coalescent 
simulations (0.66, P = 0.007), consistent with a contem- 
porary clonal expansion (partial selective sweep). 
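The statistic underlying this test can be sketched as the summed length of external (leaf) branches divided by the summed length of internal branches; a star-shaped genealogy, as expected after a clonal expansion, gives a high ratio. The formulation below is one common way to compute it and is an assumption, as are the two toy trees. The comparison against coalescent simulations used to obtain the P value in this study is not reproduced.

```python
def branch_length_sums(node, is_root=True):
    """Return (external, internal) summed branch lengths for a rooted tree.

    A node is a tuple (branch_length, children); leaves have an empty child list.
    The branch above the root is ignored.
    """
    length, children = node
    if not children:
        return length, 0.0
    external = 0.0
    internal = 0.0 if is_root else length
    for child in children:
        child_ext, child_int = branch_length_sums(child, is_root=False)
        external += child_ext
        internal += child_int
    return external, internal


def external_internal_ratio(tree):
    external, internal = branch_length_sums(tree)
    return external / internal


# Hypothetical toy genealogies (branch lengths in arbitrary units).
star_like = (0.0, [(0.1, [(1.0, []), (1.0, []), (1.0, [])]), (1.0, [])])
deep = (0.0, [(1.0, [(0.1, []), (0.1, [])]), (1.0, [(0.1, []), (0.1, [])])])

print(external_internal_ratio(star_like))  # long tips, short internal branch
print(external_internal_ratio(deep))       # short tips, long internal branches
```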
Table 5. Nucleotide diversity of selected Bradyrhizobium lineages from field sites A and B.

                   Lineage II (n = 120)       Lineage IV (n = 53)        Lineage V (n = 19)
Gene               h     S     π              h     S     π              h     S     π
Core genes
  atpD             1     0     0              2     1     0.00033        4     16    0.00588
  glnII            3     2     0.00093        5     5     0.00209        5     15    0.00431
  recA             1     0     0              6     18    0.00832        2     9     0.00208
  gyrB             5     57    0.00814        5     56    0.00934        5     58    0.02202
  rpoB             3     2     0.00037        6     13    0.00319        5     14    0.00393
  dnaK             1     0     0              2     1     0.00138        4     9     0.00383
  Concatenated     10    61    0.00177        12    94    0.00426        8     121   0.00739
Symbiotic gene
  nodC             2     1     0.00002        2     1     0.00070        3     2     0.00066

n, number of sequences. For explanation of other symbols see Table 3.



Further analysis was done to estimate the TMRCA of 
lineage II relative to lineages IV and V from the recombi- 
nation corrected tree computed for all isolates (Fig. 3a). 
The TMRCAs (based on branch lengths) of lineages IV 
and V were similar, but about twice that of lineage II. 
This suggests that lineages IV and V may have appeared 
at about the same time, whereas lineage II may have 
emerged more recently. The exceptionally low nucleotide 
diversity at most loci of lineage II (Table 5) relative to 
lineages IV and V appears consistent with an inferred 
recent origin of this lineage, with a relatively low level of 
recombination and with the diversity purging effect of a 
partial selective sweep. 

The extent of genetic differentiation and divergence 
between lineages II, IV, and V was further analyzed using 
population subdivision and gene flow statistics (Table
S10). Based on concatenated core gene sequences, the
high values of population subdivision statistics (Dxy,
KST*, and FST), the high number of fixed differences, and
the low effective number of migrants (Nm) all indicate 
substantial differentiation and genetic isolation of lineages 
II, IV, and V. These statistics suggest that the extent of 
genetic divergence and isolation was greatest for lineages 
II versus IV, IV versus V, and least for sister lineages, II 
versus V, consistent with the relative placement (branch 
lengths) of these lineages in the recombination corrected 
genealogy (Fig. 3a). 
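The inverse relationship between population subdivision and gene flow can be illustrated with the island-model approximation for haploid organisms, in which FST is roughly 1/(1 + 2Nm). Whether this exact estimator was used for Table S10 is an assumption, and the FST values below are arbitrary illustrations rather than values from this study.

```python
def nm_from_fst(fst, ploidy_factor=2):
    """Effective number of migrants per generation under the island model.

    Uses Nm = (1/FST - 1) / ploidy_factor, with ploidy_factor = 2 for haploids
    (assumed here) and 4 for diploid autosomal loci.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in the interval (0, 1]")
    return (1.0 / fst - 1.0) / ploidy_factor


# Arbitrary FST values showing how Nm falls as differentiation rises.
for fst in (0.1, 0.5, 0.9):
    print(f"FST = {fst:.1f} -> Nm = {nm_from_fst(fst):.3f}")
```

Under this approximation, high FST values correspond to Nm well below 1, which is the pattern described above for lineages II, IV, and V.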

Relationships between bradyrhizobia
associated with soybeans and native 
legumes 

Using two soybean cultivars as trap plants, 148 isolates of 
soybean-nodulating bacteria were obtained from root- 
zone soils of native legumes, A. bracteata (29 isolates) and 
D. canadense (119 isolates). Relatively few nodules 



(between 7 and 25 per plant) were elicited on roots of 
trap plants. Symbiotic bacteria associated with these 
native legumes varied with regard to nitrogen-fixing effec- 
tiveness on soybeans, as judged by variation in nodule 
size (2-10 mm), interior nodule color (white/green to 
pink), and low to intermediate overall relative effective- 
ness (RE) values (Table 6). 

Phylogenetic trees of recA (all isolates) and concatenated
recA-dnaK (selected isolates) partial gene sequences
representing bradyrhizobia from native legumes and
soybean sites A and B are shown in Figure S5. Soybean-nodulating
bacteria from D. canadense exhibited the same
recA-dnaK genotype as lineage V isolates (B. japonicum
group 1); two of these isolates had the same genotype as
inoculant strain 532C and type strain USDA6ᵀ (ST24).
In contrast, all sequences of soybean-nodulating bacteria
from A. bracteata clustered with bacteria in lineages I and
II.

Plant tests with A. bracteata showed that 8 of 10 and 9 
of 10 plants were nodulated following inoculation with 
suspensions of soil from field sites A and B, respectively 
(data not shown). Although soybean-nodulating bacteria 
in lineages III and IV were not isolated from the two 
native legumes used in this study, results of plant tests 
showed that isolates representing lineage III (OO99) and
lineage IV (HM155 and USDA110) from soybean field
sites were capable of eliciting nodules on both D. canadense
and A. bracteata (data not shown). Moreover, database
searches showed that several rpoB (partially overlapping)
and dnaK sequences of bradyrhizobia from diverse
legumes native to North America had 98-99% identity 
with sequences of isolates representing each of the five 
lineages defined in this study. 

ML gene trees of unique nodC and nifH partial gene
sequences representing bacterial isolates from sites A and 
B and isolates from A. bracteata and D. canadense are 
Table 6. Characteristics of symbiotic bradyrhizobia from root-zone soils of Amphicarpaea bracteata and Desmodium canadense using soybeans
as trap plants.

                      Trap host (soybean cultivar)
                      AC Glengarry              AC Orford
Root-zone soil from   Ave. nodules              Ave. nodules              Core lineage/nodC-nifH group
native legume         per plant        RE       per plant        RE       (no. of isolates)
A. bracteata          16.7             5.2      6.8              16.3     Lineage I/nod-nif II (10); Lineage II/nod-nif I (19)
D. canadense          24.7             32.9     24.0             51.4     Lineage V/nod-nif I (119)

Relative effectiveness (RE) values are based on uninoculated plants supplied with 1% (w/v) KNO3 instead of an effective reference strain. Lineage
assignment based on analysis of the recA (148 isolates), dnaK, nodC, and nifH (selected isolates) partial gene sequences shown in Tables S4 and
S5.



shown in Figure 5. The topologies of these symbiotic gene
trees are highly congruent, consistent with 'hitchhiking' of
nodC and nifH genes that are located in close proximity
on the chromosomal symbiosis island region (Table S1)
triggered by periodic selection events favoring particular
adaptive symbiotic gene variants. While we were unable
to generate a nodC sequence fragment (726 bp) of the
soybean-nodulating bacterium, B. liaoningense LMG18230ᵀ, a
partially overlapping sequence, available in public
databases (JN993965), showed 99% identity with nodC
sequences of B. japonicum USDA6ᵀ (AP012206) and
USDA110 (BA000040) in the nod I group; a nifH
sequence of LMG18230ᵀ (EU818925) was placed in the
nif I group (not shown).

Comparison of the topologies of the ML symbiotic 
gene trees with the network graph and ML tree of core 
genes (Fig. 2; Fig. S1) indicated that they were not in
complete agreement, although it should be noted that the 
lack of resolution of inner edges in the network graph 
was a confounding factor. 

In both symbiotic gene trees, sequences of soybean-nodulating
bacteria from sites A and B and from native
legumes were placed in two highly supported groups
(nod-nif I and nod-nif II). The nod-nif I group included
all soybean-nodulating bacteria in lineages II, III, IV
(B. japonicum 1a), and V (B. japonicum 1) together with
the lineage defined by inoculant strain 61A124, whereas
the nod-nif II group was defined by B. elkanii and lineage
I. This nonrandom clustering of core gene lineages in
symbiotic gene groups may indicate that the inheritance
of symbiotic gene variants is predominantly by vertical
transmission. Indeed, the phi test and RDP version 4
(Martin et al. 2010) suite of recombination detection
methods (RDP, GENECONV, BootScan, Chimaera,
MaxChi, SiScan) did not detect significant intragenic
recombination at either nifH or nodC loci of isolates
from sites A and B, native legumes, and reference taxa of
soybean-nodulating bacteria. The substantially lower
levels of nucleotide diversity at nodC relative to core
gene loci of lineages II, IV, and V (Table 5) might also
be a further indication of low levels of recombination at
the nodC locus and/or a recent origin of this gene.
Moreover, the recombination-corrected genealogy of core
loci (Fig. 3a) suggests that lineages II, III, IV, and V (in
nod-nif group I) are monophyletic, consistent with a
hypothesis of predominantly vertical transmission of
symbiotic genes.

Tests were done to assess the relative nitrogen-fixing
effectiveness (RE) of 17 isolates representing lineages I to
V from field sites A and B using soybean cultivar O as
test plant (Table 2). These isolates exhibited considerable
variation in effectiveness relative to reference strain
USDA110 and uninoculated plants. For example, RE values
of isolates in lineage III varied between 0% (isolate
OO85) and 136% (isolate OO99). Similarly, isolates in
lineage II that were found to be associated with A. bracteata
and predominated at both soybean sites varied
between poorly effective (RE, 52%) for isolate OO107
and highly effective (RE, 126%) for isolate OM55. Isolate
HO185 (lineage IV) that has a multilocus genotype
corresponding to inoculant strain USDA142 was the most
effective isolate tested (RE, 152%). Several isolates that
had the same genotype as an inoculant strain were less
effective than reference strain USDA110. In particular,
isolate HO196 (ST24, lineage V) was ineffective (RE, 7%)
and possessed the same core and symbiotic genotype as
the effective inoculant strain 532C and two isolates of
soybean-nodulating bacteria from D. canadense. Similarly,
isolate HO186 (ST32, lineage I) from site B that was
poorly effective (RE, 52%) had the same genotype as 10
isolates from A. bracteata as well as effective commercial
strain 61A101 that was taken out of service before
soybeans were grown at site B (Table 1). Isolate HM155
(lineage IV) with the USDA110 genotype was also less
effective than the culture of USDA110 used as reference
in these tests.







Figure 5. ML phylogenetic trees of nodC (726 bp) and nifH (669 bp) partial gene sequences representing reference taxa, soybean-nodulating
bacteria from field sites A and B and from native legumes (Amphicarpaea bracteata and Desmodium canadense). The nodC phylogeny is based
on unique sequences representing 220 bacterial isolates (sites A and B) and reference strains shown in Figure 2. The nifH phylogeny is based on
reference strains and selected isolates. Both nifH and nodC trees include representative isolates from A. bracteata in lineages I and II and from D. 
canadense in lineage V (see Table SB for isolate/sequence accession numbers). Bootstrap values >50% (1000 nonparametric replications) are 
indicated at nodes. Scale bar represents estimated substitutions per site. 



Discussion 

Soybeans are colonized by bradyrhizobia 
originating from native legumes 

In this study, the use of MLST of six core gene sequences 
as well as the combination of STRUCTURE, ClonalFrame, 
and conventional ML phylogenetic analyses indicated that 
the Bradyrhizobium populations associated with soybeans 
at two field sites (A and B) with contrasting histories of 
cultivation and inoculation were highly structured: all 
analyses supported division of 220 bacterial isolates in five 
lineages corresponding either to B. japonicum groups 1 
(lineage V) and 1a (lineage IV) or to one of three novel
lineages (I, II, and III) within the genus Bradyrhizobium. 

Consistent with expectations based on early reports of 
variable, but often poor persistence of introduced strains 
in the years succeeding soybean inoculation (reviewed by 
Keyser and Li 1992 and Streeter 1994), none of the iso- 
lates from site A and about 20% from site B (the only site 
where soybeans had been inoculated recently) were found 
to have originated from inoculation sources. Despite a 
history of recurrent Bradyrhizobium inoculation at site B, 
only isolates with multilocus genotypes (ST) corresponding
to USDA110, USDA136, and USDA142 were attributed
to inoculation sources, and of these, only USDA142
showed evidence of clonal expansion, further emphasizing
that most introduced strains lacked traits for successful
establishment and were rapidly purged from the legume
crop ecosystem.

A study in the genus Mesorhizobium indicated that the 
appearance of competitive symbiotic bacteria following 
the introduction of an exotic crop legume and inoculant 
strain into a new environment was due to resident soil 
bacteria acquiring a symbiosis island from the introduced 
strain by lateral transfer (Nandasena et al. 2007). Our 
data for the genus Bradyrhizobium provide an alternative 
explanation for the rapid displacement of inoculant 
strains by resident soil bacteria and strongly suggest that 
soybeans were preferentially using bacterial symbionts 
that possessed the pre-existing capacity to nodulate the 
exotic host and originated from legumes native to eastern 
Canada. This was based on phylogenetic analyses of two 
core (recA and dnaK) and two symbiotic (nifH and nodC)
gene sequences that placed soybean-nodulating bacteria 
from A. bracteata and D. canadense in the same core and 
symbiotic gene lineages as isolates encountered at soybean 
field sites. In particular, isolates in core lineage II were 
distinct from named species and inoculant strains,
predominated at soybean sites and were associated with
A. bracteata. The contemporary clonal expansion of this 
lineage, as evidenced by the topology of the recombination- 
corrected genealogy and significant external to internal 
branch length ratio test (Fig. 4 and Fig. S3), was domi- 
nated by a few closely related genotypes suggesting that 
these bacteria, in contrast to most inoculant strains, pos- 
sessed symbiotic fitness or adaptive characteristics and 
were selected by the host plant (i.e., a partial selective 
sweep). 

Although we did not recover soybean-nodulating bacteria
in core lineages III and IV (B. japonicum 1a) from
native legumes, plant infection tests indicated that isolates
representing these lineages from soybean field sites were
capable of nodulating both A. bracteata and D. canadense.
Moreover, legumes native to eastern North America (Apios
americana and Desmodium glutinosum) were reported to
harbor rare genotypes similar to USDA110 (lineage IV)
based on 16S and 23S rRNA sequence analysis (Parker
1999).

Interestingly, two isolates from D. canadense and 10
isolates from A. bracteata had the same core and symbiotic
genotypes as, respectively, inoculant strains 532C
(B. japonicum 1; lineage V) and 61A101 (lineage I; in a
clade with B. elkanii). Strain 61A101 was originally
isolated from soybeans grown in the United States
(Table 1), whereas 532C was isolated from soybeans 
grown in Brazil that were inoculated with strains from 
the United States (Santos et al. 1999), suggesting that 
both of these strains might have originated from 
legumes native to North America. Further support for 
this proposition was provided by our finding that a 
minority of isolates from field site B had the same ST 
and symbiotic genotype as, respectively, 532C and 
61A101, but exhibited poorly effective nitrogen-fixing 
phenotypes atypical of inoculant strains that are invari- 
ably selected for high symbiotic effectiveness with soy- 
beans. Moreover, inoculant strain 61A101 was taken out 
of service several years before soybeans were cultivated 
at site B (the only site where isolates with the 61A101 
genotype were encountered), further indicating that these 
isolates originated from bacterial symbionts that were 
associated with native legumes. 

Soybeans grown in the United States are often 
nodulated by resident soil bacteria exhibiting variable and 
sometimes sub-optimal nitrogen-fixing capabilities relative 
to inoculant strains (Kvien et al. 1981; Keyser and Li 
1992). Our data are consistent with these reports in that 
they provide an indication of considerable variation in 
the nitrogen-fixing capabilities of soybean-nodulating bac- 
teria associated with native legumes as well as among 17 



isolates representing different core and symbiotic gene 
lineages from soybean sites. In particular, our finding that 
some isolates representing lineages of indigenous origin 
(e.g., lineage II) were more effective than a reference 
inoculant strain has important practical implications and 
suggests that bradyrhizobia associated with native legumes 
may have potential for the selection of ecologically 
adapted and efficient nitrogen-fixing bacteria for soybean 
inoculation. In this regard, further detailed sampling and 
analysis of Bradyrhizobium populations associated with 
diverse taxa of native legumes is the topic of a subsequent 
report. 

Different genotypes of soybean (Cregan and Keyser 
1986; Lohrke et al. 1996) and A. bracteata (Marr et al. 
1997) have been reported to vary in their nodulation 
specificities for specific groups of symbiotic bacteria. Our 
data based on STRUCTURE analysis suggest that all iso- 
lates exhibiting mixed core gene ancestries were sampled 
exclusively by one of two soybean cultivars. Moreover, 
there were small, but consistent differences between soy- 
bean cultivars used as trap hosts with regard to haplotype, 
gene, and nucleotide diversity for both core and nodC 
gene sequences of isolates sampled from soybean field 
sites. Such plant selective effects may reflect the contrast- 
ing pedigrees of two soybean cultivars used for bacterial 
sampling in this study (Voldeng et al. 1996, 1999). 

Biogeography and to some extent, legume taxon have 
been implicated as factors that play a role in structuring 
Bradyrhizobium populations associated with crop legumes 
(Stepkowski et al. 2005, 2007). In our study, an indica- 
tion of spatial structuring was provided by the differential 
recovery of Bradyrhizobium lineages at soybean field sites 
280 km apart, but differing in soil characteristics, crop- 
ping, and inoculation histories. Assuming that random 
factors (i.e., genetic drift and sampling error) were not 
responsible for the observed differences, it is tempting to 
suggest that the exclusive recovery of lineages I and III 
from one of two sites might reflect regional differences in 
the distribution of native legume taxa and their associated 
populations of bradyrhizobia. Support for this proposi- 
tion was provided by our finding that A. bracteata and 
D. canadense were associated with distinct core, and to 
some extent, symbiotic gene lineages of soybean-nodulat- 
ing bacteria. Moreover, Spoerke et al. (1996) reported 
that distinct sub-populations of A. bracteata in a natural 
woodland environment were associated with different lin- 
eages of Bradyrhizobium defined on the basis of enzyme 
electrophoresis. Further investigation is obviously needed 
to elucidate the role of native legume biogeography (at 
local and regional scales) with regard to the spatial struc- 
turing of Bradyrhizobium populations associated with 
crop legumes. 
Microevolution and recombination in
bradyrhizobia associated with soybeans 

Homologous recombination in bacteria is a major evolu- 
tionary force and occurs by the three mechanisms of 
conjugation, transformation, and transduction (Didelot 
and Maiden 2010). In this study, use of multiple methods 
and analyses suggested that homologous recombination 
had occurred at core loci of soybean-nodulating bacteria 
from field sites A and B. The rate of recombination rela- 
tive to mutation {p/9) for the 35 STs from soybean sites, 
inferred by CIonalFrame reconstruction, indicated that 
recombination was about 11 times less frequent than 
mutation. This rate of recombination is significantly 
lower than an estimate of p/9 for three core genes in the 
symbiotic bacterium, Rhizobium leguminasarum (Tian 
et al. 2010), but is in line with estimates based on core 
and accessory (virulence) genes in lineages of bacterial 
pathogens such as Listeria (den Bakker et al. 2008) and 
Chlamydia (Joseph et al. 2011). Although our data sug- 
gest that mutation was the predominant evolutionary 
mechanism in soybean-nodulating bacteria, recombina- 
tion was found to have a significant effect on genetic 
diversification (r/m wl.O) and was reflected in the direct 
relationship between nucleotide diversity and total num- 
ber of inferred recombination events in selected lineages. 
The effect of recombination on genetic diversification was 
expected a priori, because recombination, unlike point 
mutation, affects multiple nucleotides at each event (Didelot 
et al. 2009). For similar reasons, recombination is consid- 
ered to represent a much faster mode of evolution than 
mutation (e.g., den Bakker et al. 2008). This was clearly 
illustrated by ClonalFrame reconstruction indicating that 
time to the most recent common ancestor (TMRCA), 
estimated on the basis of relative branch lengths, was 
almost three times shorter in the recombination-corrected 
genealogy of 220 isolates from soybean sites relative to 
the genealogy without correction. Recombination, partic- 
ularly when extensive, may have a significant confounding 
effect on phylogenetic inference by potentially obscuring 
evolutionary relationships (e.g., Lewis-Rogers et al. 2004). 
In our study, this effect was evidenced by the extent of 
phylogenetic uncertainty (reticulation) on inner edges of 
the "species" network graph of concatenated core gene 
sequences, and by significant distortion of branch lengths 
and branching order in the ClonalFrame genealogy 
without correction for recombination. 
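
The relationship between the two statistics discussed above can be illustrated with a short calculation. This is a minimal sketch, not a computation reported in the study: it assumes the common approximation that r/m equals ρ/θ multiplied by the mean number of substitutions introduced per recombination event, and it takes ρ/θ ≈ 1/11 and r/m ≈ 1.0 from the text. The function name is hypothetical.

```python
def substitutions_per_recombination_event(rho_over_theta: float,
                                          r_over_m: float) -> float:
    """Mean substitutions introduced per recombination event.

    Assumes (for illustration only) that r/m ~= (rho/theta) * s, where s is
    the mean number of substitutions introduced by a single recombination
    event, so that s = (r/m) / (rho/theta).
    """
    if rho_over_theta <= 0:
        raise ValueError("rho_over_theta must be positive")
    return r_over_m / rho_over_theta


# Values taken from the text: recombination about 11 times less frequent
# than mutation (rho/theta ~= 1/11) and r/m approximately 1.0.
rho_over_theta = 1 / 11
r_over_m = 1.0

subs_per_event = substitutions_per_recombination_event(rho_over_theta, r_over_m)
print(f"Implied substitutions per recombination event: {subs_per_event:.1f}")
```

Under these assumptions, each recombination event would introduce on the order of 11 substitutions, which is consistent with the statement that recombination, although rarer than mutation, had an approximately equal effect on diversification.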

Assuming that spatial separation is not a limiting fac- 
tor, gene flow is generally expected to occur more often 
between members of the same bacterial species or closely 
related species than between different species (Didelot 
and Maiden 2010). On the basis of STRUCTURE analysis, 
less than 10% of the isolates in five lineages were found 
to exhibit mixed ancestries, suggesting that interlineage 
transfer of core genes occurred infrequently. Such infre- 
quent gene flow between lineages is consistent not only 
with our data indicating substantial divergence and isolation 
of different evolutionary lineages (based on ClonalFrame 
reconstruction, population subdivision, and gene flow 
analyses) but also with novel lineages I, II, and III being 
considered distinct genomic species. Interestingly, several 
tests suggested that much of this gene flow was attribut- 
able to the gyrB locus (coding for type II topoisomerase 
involved in DNA replication, transcription, recombina- 
tion, and repair) that was inferred to have an unusual 
evolutionary history defined by multiple recombination 
events including interlineage transfers. 

Physical proximity is a major requirement for recombi- 
nation to take place between bacterial members in different 
lineages (Didelot and Maiden 2010). Pretorius-Guth et al. 
(1990) suggested that the root nodule may represent the 
most favorable natural environment for recombination 
between symbiotic bacteria. However, a recent report 
(van Berkum et al. 2012) indicated that mixed occupancy 
of bacterial genotypes in root nodules of field-grown 
soybeans occurred at a frequency of less than 3%, suggesting 
that mixed infections involving members of different 
lineages may be an extremely rare event. As the nutrient-rich 
rhizosphere of the plant host is capable of supporting 
large bacterial populations, it may provide conditions 
more suitable for recombination between members of 
sympatric lineages, particularly in view of evidence for 
population density-dependent quorum regulation in the 
bradyrhizobia (e.g., Loh and Stacey 2003). 

Differences in the topologies of symbiotic and core gene 
trees have been attributed to multiple lateral transfers of 
symbiotic loci among various Bradyrhizobium lineages (e.g., 
Stepkowski et al. 2005, 2007; Steenkamp et al. 2008), 
whereas other reports (e.g., Moulin et al. 2004; Menna 
and Hungria 2011) have implicated vertical transmission 
as the main mechanism of inheritance. In our study, phy- 
logenetic analysis of nodC and nifH sequences placed all 
core gene lineages of soybean-nodulating bacteria from 
field sites A and B, native legumes, and reference taxa in 
two highly supported symbiotic gene lineages (nod-nif I 
and nod-nif II) corresponding to B. japonicum and B. elkanii, 
respectively. In agreement with the conclusions of Moulin 
et al. (2004) and Menna and Hungria (2011), our data 
provided several lines of evidence suggesting that vertical 
transmission was the predominant mechanism of inheri- 
tance of symbiotic gene variants in soybean-nodulating 
bacteria including the inability of multiple tests to detect 
recombination at either nodC or nifH loci and the appar- 
ent monophyly of the B. japonicum clade consisting of 
core lineages II, III, IV, and V (clustering in the nod-nif I 
lineage) in the recombination-corrected ClonalFrame 
genealogy. Our inferences from ClonalFrame reconstruction 
are also consistent with a proposal by Itakura et al. 
(2009) that an ancestor of "B. japonicum" (B. japonicum 
clade in this study) may have diverged into different lineages 
after the acquisition of a symbiosis island region (genomic 
island). 

In conclusion, our study has not only provided new 
insights into the role of homologous recombination in 
the genetic diversification of lineages of soybean-nodulat- 
ing bacteria but also strongly suggests that soybeans in 
eastern North America are predominantly colonized by 
bradyrhizobia originating from native legumes, irrespec- 
tive of the practice of recurrent inoculation. Moreover, 
our study highlights the potential of sequence-based infer- 
ence methods for detailed studies on the microevolution 
and ecology of populations of symbiotic bacteria. The 
sequencing of more genes (up to whole genomes) may 
provide greater insight into the microevolutionary 
dynamics of these populations. 